Automatic Induction of Bellman-Error Features for Probabilistic Planning

Authors

  • Jia-Hong Wu
  • Robert Givan
Abstract

Every goal-oriented domain with a problem generator from the first or second IPPC (Younes, Littman, Weissman, & Asmuth, 2005; Bonet & Givan, 2006) was considered for inclusion in our experiments. For inclusion, we require a planning domain with a fixed action definition, as defined in Section 2.4 of the paper, that in addition has only ground conjunctive goal regions. Four domains have these properties directly, and we have adapted three more of the domains to have these properties, as we describe in the next paragraph. The resulting selection provides seven IPPC planning domains for our empirical study. Figure 4 lists the reasons for the exclusion of the other six goal-oriented domains. In addition, four of the domains that we use in evaluation occur in both competitions in slightly different forms, and we evaluate on one version of each of these four, as described in Figure 5.

The three domains we adapted for inclusion are as follows. We created our own problem generators for the first IPPC domains TOWERS OF HANOI and FILEWORLD, as none were provided in the competition. For both of these domains, there is only one instance of each size. In Towers of Hanoi, all instances share the same action set and state predicates, so that a suitable problem generator is straightforward. In Fileworld, a planning domain with a fixed action definition results if we consider the collection of instances that share the same fixed number of folders but vary in the number of files. When the number of folders varies, the state predicates and actions change, so that instances with varying numbers of folders cannot be in the same fixed-action-definition planning do-

Similar resources

Discovering Relational Domain Features for Probabilistic Planning

In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the feasible problem size. We consider the problem of automatically finding useful domain features in problem domains that exhibit relational structure. Specifically we consider learning compact relational features withou...

Relational State-Space Feature Learning and Its Applications in Planning

We consider how to learn useful relational features in linear approximated value function representations for solving probabilistic planning problems. We first discuss a current feature-discovering planner that we presented at the International Conference on Automated Planning and Scheduling (ICAPS) in 2007. We then propose how the feature learning framework can be further enhanced to improve p...

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces

This paper addresses the problem of automatic generation of features for value function approximation in reinforcement learning. Bellman Error Basis Functions (BEBFs) have been shown to improve policy evaluation, with a convergence rate similar to that of value iteration. We propose a simple, fast and robust algorithm based on random projections, which generates BEBFs for sparse feature spaces....

Towards Clause-Learning State Space Search: Learning to Recognize Dead-Ends

The ability to learn from conflicts is a key algorithm ingredient in constraint satisfaction (e. g. [6, 24, 20, 22, 8, 2]). For state space search, like goal reachability in classical planning which we consider here, progress in this direction has been elusive, and almost entirely limited to length-bounded reachability, where reachability testing reduces to a constraint satisfaction problem, ye...

A Customer Oriented Approach for Distribution System Reliability Improvement using Optimal Distributed Generation and Switch Placement

The reliability of distribution networks is inherently low due to their radial nature, consequently distribution companies (DisCos) usually seek to improve the system reliability indices with the minimum possible investment cost. This can be known as system-oriented reliability planning (SORP). However, there can exist some customers that are not satisfied by their reliability determined by ado...

Journal:
  • J. Artif. Intell. Res.

Volume 38

Publication date: 2010